HMM-Based Speech Enhancement Using Pitch Period Information in Voiced Speech Segments
Author
Abstract
An extension of the HMM-based speech enhancement approach [1] is presented. The HMM-based scheme uses hidden Markov models (HMMs) to control a state-dependent Wiener filter that processes the noisy speech signal. This scheme yields enhanced speech without the annoying tonal artefacts (‘musical noise’) of the spectral subtraction approach. However, parts of the enhanced signal often sound rough or hoarse. In this paper, it is shown that this effect occurs because the Wiener filter does not remove the noise between the harmonics of voiced speech segments. An algorithm based on least squares (LS) estimation is proposed that uses pitch period information to remove these noise components. Moreover, it is shown that estimation involving low-energy states of the speech HMM is unreliable; therefore, a noise floor is inserted during low-energy speech segments instead of filtering the signal.
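As a rough illustration of the state-dependent Wiener filtering and the noise floor described in the abstract, the following minimal sketch may help. It is not the paper's implementation: the function names, the posterior-weighted combination of per-state gains, the choice of the most likely state to detect low-energy segments, and the `floor_gain` value are all assumptions made for this example. The pitch-based LS step is not sketched here.

```python
def wiener_gain(speech_psd, noise_psd):
    """Per-bin Wiener gain H = S / (S + N) for one HMM state."""
    return [s / (s + n) if (s + n) > 0 else 0.0
            for s, n in zip(speech_psd, noise_psd)]


def state_weighted_gain(state_psds, state_posteriors, noise_psd):
    """Combine per-state Wiener gains, weighted by state posteriors (assumed scheme)."""
    gain = [0.0] * len(noise_psd)
    for psd, weight in zip(state_psds, state_posteriors):
        for k, h in enumerate(wiener_gain(psd, noise_psd)):
            gain[k] += weight * h
    return gain


def enhance_frame(noisy_spectrum, state_psds, state_posteriors, noise_psd,
                  low_energy_states=(), floor_gain=0.1):
    """Filter one frame, or insert a noise floor for low-energy states.

    low_energy_states and floor_gain are hypothetical parameters: when the
    most likely state is marked low-energy, the noisy frame is only scaled
    by a constant instead of being Wiener filtered.
    """
    best = max(range(len(state_posteriors)), key=lambda i: state_posteriors[i])
    if best in low_energy_states:
        return [floor_gain * x for x in noisy_spectrum]
    gain = state_weighted_gain(state_psds, state_posteriors, noise_psd)
    return [g * x for g, x in zip(gain, noisy_spectrum)]
```

In this sketch, a frame whose most likely state is listed in `low_energy_states` bypasses the filter entirely, which mirrors the idea of not relying on unreliable estimates in low-energy segments.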
Similar resources
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...
A Segmental HMM for Speech Waveforms
We present a purely time-domain approach to speech processing which identifies waveform samples at the boundaries between glottal pulse periods (in voiced speech) or at the boundaries between unvoiced segments. An efficient algorithm for inferring these boundaries is derived from a simple probabilistic generative model of speech, and state-of-the-art results are presented on pitch tracking, voic...
Pitch estimation using mutual information
A spectrotemporal method based on Mutual Information (MI) is proposed for pitch estimation of voiced speech signals. We use MI as the similarity measure between voiced speech segments and their delayed version. Instead of measuring linear dependencies, MI measures statistical dependency, which suits the dynamic characteristic of speech signals. Besides, higher-order statistics are directly enco...
Using Noisy Speech to Study the Robustness of a Continuous F0 Modelling Method in HMM-based Speech Synthesis
In parametric text-to-speech synthesis using hidden Markov models (HMMs), modelling of the fundamental frequency (F0) parameter is important because it directly affects the prosody of synthetic speech. F0 is typically modelled by a discrete distribution for unvoiced speech and a continuous distribution for voiced speech, using a multi-space distribution (MSD). However, F0 modelling using MSD-HMM i...
Pitch synchronized speech processing (PSSP) for speaker recognition
A method for speech signal enhancement is developed with application to automatic speaker recognition where the signals have different channel conditions. The basis of this technique is a robust pitch detection algorithm that accurately estimates the instantaneous pitch rate, and extracts single pitch period speech segments. This technique of pitch synchronized speech processing (PSSP) provides...